The AI Economist: Optimal Economic Policy Design via Two-level Deep Reinforcement Learning
Abstract
AI and reinforcement learning (RL) have improved many areas, but are not yet widely adopted in economic policy design, mechanism design, or economics at large. At the same time, current economic methodology is limited by a lack of counterfactual data, simplistic behavioral models, and limited opportunities to experiment with policies and evaluate behavioral responses. Here we show that machine-learning-based economic simulation is a powerful policy and mechanism design framework to overcome these limitations. The AI Economist is a two-level, deep RL framework that trains both agents and a social planner who co-adapt, providing a tractable solution to the highly unstable and novel two-level RL challenge. From a simple specification of an economy, we learn rational agent behaviors that adapt to learned planner policies and vice versa. We demonstrate the efficacy of the AI Economist on the problem of optimal taxation. In simple one-step economies, the AI Economist recovers the optimal tax policy of economic theory. In complex, dynamic economies, the AI Economist substantially improves both utilitarian social welfare and the trade-off between equality and productivity over baselines. It does so despite emergent tax-gaming strategies, while accounting for agent interactions and behavioral change more accurately than economic theory. These results demonstrate for the first time that two-level, deep RL can be used for understanding and as a complement to theory for economic design, unlocking a new computational learning-based approach to understanding economic policy.
Related resources
Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization
Reinforcement learning can acquire complex behaviors from high-level specifications. However, defining a cost function that can be optimized effectively and encodes the correct task is challenging in practice. We explore how inverse optimal control (IOC) can be used to learn behaviors from demonstrations, with applications to torque control of high-dimensional robotic systems. Our method addre...
Optimal Economic Design through Deep Learning (Short paper)
Designing an auction that maximizes expected revenue is an intricate task. Despite major efforts, only the single-item case is fully understood. We explore the use of tools from deep learning on this topic. The design objective is revenue optimal, dominant-strategy incentive compatible auctions. For a baseline, we show that multi-layer neural networks can learn almost-optimal auctions for a var...
Optimal policy switching algorithms for reinforcement learning
We address the problem of single-agent, autonomous sequential decision making. We assume that some controllers or behavior policies are given as prior knowledge, and the task of the agent is to learn how to switch between these policies. We formulate the problem using the framework of reinforcement learning and options (Sutton, Precup & Singh, 1999; Precup, 2000). We derive gradient-based algor...
On-Policy vs. Off-Policy Updates for Deep Reinforcement Learning
Temporal-difference-based deep-reinforcement learning methods have typically been driven by off-policy, bootstrap Q-Learning updates. In this paper, we investigate the effects of using on-policy, Monte Carlo updates. Our empirical results show that for the DDPG algorithm in a continuous action space, mixing on-policy and off-policy update targets exhibits superior performance and stability comp...
Shared Autonomy via Deep Reinforcement Learning
In shared autonomy, user input is combined with semi-autonomous control to achieve a common goal. The goal is often unknown ex-ante, so prior work enables agents to infer the goal from user input and assist with the task. Such methods tend to assume some combination of knowledge of the dynamics of the environment, the user’s policy given their goal, and the set of possible goals the user might ...
Journal
Journal title: Social Science Research Network
Year: 2021
ISSN: 1556-5068
DOI: https://doi.org/10.2139/ssrn.3900018